Abstract

To achieve the available performance gains in half-duplex wireless relay networks, several cooperative 
schemes have been earlier proposed using either distributed space-time coding or distributed beamforming 
for the transmitter without and with channel state information (CSI), respectively. However, these schemes 
typically have rather high implementation and/or decoding complexities, especially when the number of 
relays is high. In this paper, we propose a simple low-rate feedback-based approach to achieve maximum 
diversity with a low decoding and implementation complexity. To further improve the performance of the 
proposed scheme, the knowledge of the second-order channel statistics is exploited to design long-term 
power loading through maximizing the receiver signal-to-noise ratio (SNR) with appropriate constraints. 
This maximization problem is approximated by a convex feasibility problem whose solution is shown to 
be close to the optimal one in terms of the error probability. Subsequently, to provide robustness against 
feedback errors and further decrease the feedback rate, an extended version of the distributed Alamouti 
code is proposed. It is also shown that our scheme can be generalized to the differential transmission 
case, where it can be applied to wireless relay networks with no CSI available at the receiver. 

I. Introduction

The performance of wireless communication systems can be severely affected by channel fading. To 
combat fading, multi-antenna systems are commonly used because the independent
paths between the transmitter and receiver in such systems can be used to achieve a higher degree of diversity than in
single-antenna systems [1]-[3]. However, restrictions in size and hardware costs can make the use of
multi-antenna systems impractical in wireless networks. Fortunately, similar independent paths are also 
available in wireless networks with multiple single-antenna nodes, where some nodes are used as relays 
that help to convey the information through the network. Using such relays between the transmitter and 
receiver nodes offers the so-called cooperative diversity and, hence, can be a good alternative to using 
multiple antennas at the transmitter and/or receiver. Several cooperation methods between network nodes 
have been proposed based on different relaying strategies; see [4]-[10] and references therein. 

Among different relaying approaches, techniques using the amplify-and-forward relaying strategy are 
of particular practical interest because they do not require any signal processing (such as decoding or
compression) at the relays. 

The use of space-time codes (originally developed for multi-antenna systems [11], [12]) in a distributed
fashion has been proposed for relay networks in [4] and [13] using the amplify-and-forward approach. In 
this cooperative strategy, the source terminal first transmits the information symbols to the relays. Then, 
the relays encode their received signals and their conjugates in a linear fashion and transmit them to the 
destination node. This can be viewed as distributed space-time coding (DSTC). The DSTC techniques 
only require the knowledge of the received signal powers at the relays and can achieve the maximum 
diversity available in the network. In [14], orthogonal space-time block codes (OSTBCs) [12], [15] and 
quasi-orthogonal STBCs (QOSTBCs) [16], [17] have been used along with the DSTC strategy of [13]. 
Both these DSTC approaches have been shown to offer maximum diversity, optimal diversity products, 
low maximum likelihood (ML) decoding complexity, linear encoding of the information symbols, and 
robustness against relay failures. Unfortunately, for more than two relays, the maximum rate of OSTBCs 
reduces [18], the decoding delay increases, and the linear ML decoding complexity is no longer achievable 
[14]. Furthermore, QOSTBCs are only applicable to particular cases with certain numbers of relays. In 
addition, their decoding complexity is higher than that of OSTBCs. 

In [19], four-group decodable DSTCs¹ for any number of relays are proposed. Although this approach
reduces the decoding complexity as compared to the full ML decoder, its complexity is still rather high,
especially in the case of more than four relays. To recover the simple symbol-by-symbol ML decoding
property of the distributed OSTBCs for more than two relays, the use of the source-to-relay CSI at the
relays has been proposed [14], [20]. However, as shown in [14], this does not improve the resulting
diversity or coding gains.

¹For these codes, it is possible to split the maximum likelihood decoding problem into four independent subproblems.

Another promising approach to amplify-and-forward relaying in wireless networks is distributed
beamforming; see [21]-[26] and references therein. As most distributed beamforming techniques require
full knowledge of the instantaneous CSI for both the source-to-relay and relay-to-destination links and,
moreover, require a feedback link between the destination and relays, the complexities of these techniques
are rather high. To decrease the distributed beamforming complexity, the use of quantized feedback for
selecting beamforming weights from a codebook has been proposed in [27]. However, the codebook
design requires a costly numerical optimization and the resulting codebook needs to be transmitted to
each relay every time the channel statistics or the transmitted powers change.

In this paper (see also [28] and [29]), we consider a wireless network where each relay only needs to 
know its average received signal power (which is a common assumption for DSTCs) and further assume 
that one-bit feedback per relay is available for every channel realization. The proposed scheme is based on 
the ideas of partial phase combining (PPC) [30], [31] and the group coherent codes (GCCs) [32] originally 
introduced for traditional multiple-antenna systems. It will be shown by means of an approximate symbol 
error rate (SER) analysis that such a low-rate feedback is sufficient to achieve maximum diversity with 
an additional coding (power) gain. Furthermore, the proposed scheme will be shown to enjoy linear 
decoding complexity and minimum decoding delay for any number of relays. Although the best possible 
choice for the feedback bits has to be found by a full search, we provide two much simpler methods to 
judiciously choose these bits. 

It should be noted that several techniques related to the proposed scheme have been developed in [33]- 
[35] in the context of sensor networks. In these papers, randomly generated relay beamformer phases 
are iteratively selected based on a low-rate feedback. In particular, in [35] the application of binary 
signaling to the approach of [33], [34] has been considered. The scheme of [33]-[35] requires multiple 
iterations to converge, and the number of such iterations is on average comparable to or larger than the
number of sensor nodes. In contrast to the approaches of [33]-[35], the proposed scheme will use a fixed 
(substantially lower) number of feedback bits without any need for multiple iterations. 

Since the quality of channel links can vary for different relays, we propose to use second-order channel 
statistics to find proper "long-term" power loading coefficients for each relay. From the performance 
viewpoint, these coefficients should be designed by minimizing the error probability, as was proposed
in [36] for a two-relay network using the distributed Alamouti code. However, the approach of [36] does
not provide any extension to the case of more than two relays. As an alternative, we propose to use the 
general idea of [24] to obtain the power loading coefficients by maximizing the average SNR subject to 
individual power constraints. However, in contrast to [24], the loss in diversity is avoided by a proper 
choice of the instantaneous feedback bits and by using appropriate constraints on these coefficients. It is 
shown that using semi-definite relaxation (SDR), the resulting SNR maximization problem can be turned 
into a convex feasibility problem which can be efficiently solved using interior point methods. Simulations 
show that the resulting solution performs very close to the direct (computationally prohibitive) approach 
that minimizes the Chernoff bound on the error probability using brute force optimization.

Using an extended version of the distributed Alamouti code, we further refine the proposed technique to 
reduce the amount of feedback without affecting the benefits of linear decoding complexity and maximum 
diversity. In addition, the use of such an extended distributed Alamouti code is shown to provide extra 
robustness to erroneous feedback. 

Finally, an extension of the proposed scheme for non-coherent receivers using differential transmission 
is developed. It is demonstrated that the proposed non-coherent scheme enjoys the same advantages in 
performance, decoding complexity and delay as its coherent counterpart. 

The remainder of this paper is organized as follows. In Section II, the system model is developed.
Section III presents the proposed scheme. Its further refinement using the extended distributed Alamouti
code is discussed in Section IV. The differential transmission extension of the proposed techniques is
developed in Section V. Computer simulations are presented in Section VI, and conclusions are drawn in
Section VII.

II. System Model 

Let us consider a half-duplex wireless relay network with R + 2 nodes where each node has a single 
antenna that can transmit or receive signals. Among these R + 2 nodes, one is the transmitter, one is 
the receiver, and the remaining R nodes are the relays. It is assumed that the direct link between the 
transmitter and the receiver cannot be established and that the relay channels are statistically independent.
We consider the quasi-static flat fading channel case with the block length $T$, and denote the channel
coefficient between the transmitter and the $i$th relay by $f_i$. Correspondingly, the channel coefficient
between the $i$th relay and the receiver is denoted by $g_i$. We assume that $f_i$ and $g_i$ are independent random
variables with the probability density functions (pdfs) $\mathcal{CN}(\mu_{f_i}, \sigma_{f_i}^2)$ and $\mathcal{CN}(\mu_{g_i}, \sigma_{g_i}^2)$, respectively, where
$\mathcal{CN}(\cdot, \cdot)$ denotes the complex Gaussian pdf.

We assume that the transmitter does not have any CSI. However, we consider a limited feedback link
between the receiver and each relay. This feedback link is used to transmit one bit for every channel
realization and can also be used to transmit long-term power loading coefficients (one per relay) every
time the channel means or variances change significantly. The receiver may or may not enjoy full CSI,
depending on the transmission mode (coherent or non-coherent), and the system is synchronized at the
symbol level.

At the transmitter side, $T$ symbols $\mathbf{s} = [s_1, \ldots, s_T]^T$ are drawn from an $M$-point constellation
according to the information bits to be sent. Here, $(\cdot)^T$ denotes the transpose. The signal $\mathbf{s}$ is normalized
as $\mathrm{E}\{\mathbf{s}^H\mathbf{s}\} = 1$, where $(\cdot)^H$ and $\mathrm{E}\{\cdot\}$ denote the Hermitian transpose and the statistical expectation,
respectively. The transmission is carried out in two steps. In the first step, the transmitter sends $\sqrt{P_0 T}\,\mathbf{s}$
from time $1$ to $T$, where $P_0$ is its average transmitted power. The received signal at the $i$th relay is given
by

$$\mathbf{r}_i = \sqrt{P_0 T}\, f_i \mathbf{s} + \mathbf{v}_i \tag{1}$$

where $\mathbf{v}_i$ is the noise vector at the $i$th relay. In the second step, the $i$th relay sends the signal $\mathbf{d}_i$ to the
receiver from time $T+1$ to $2T$. At the receiver, we have

$$\mathbf{x} = \sum_{i=1}^{R} g_i \mathbf{d}_i + \mathbf{n} \tag{2}$$

where $\mathbf{x} = [x_1, \ldots, x_T]^T$ is the received signal and $\mathbf{n}$ is the receiver noise vector. We assume that the
entries of the noise vectors $\mathbf{v}_i$ and $\mathbf{n}$ are i.i.d. random variables with the pdf $\mathcal{CN}(0, 1)$, that is, both these
noises have variance $\sigma^2 = 1$.

The transmitted signal $\mathbf{d}_i$ at each relay is assumed to be a linear function of its received signal and
its conjugate [14], that is,

$$\mathbf{d}_i = \sqrt{\frac{P_i}{m_{f_i}P_0 + 1}}\, b_i\theta_i\left(\mathbf{A}_i\mathbf{r}_i + \mathbf{B}_i\mathbf{r}_i^*\right) = \sqrt{\frac{P_0 P_i T}{m_{f_i}P_0 + 1}}\, b_i\theta_i\left(f_i\mathbf{A}_i\mathbf{s} + f_i^*\mathbf{B}_i\mathbf{s}^*\right) + \sqrt{\frac{P_i}{m_{f_i}P_0 + 1}}\, b_i\theta_i\left(\mathbf{A}_i\mathbf{v}_i + \mathbf{B}_i\mathbf{v}_i^*\right) \tag{3}$$

where $m_{f_i} = \mathrm{E}\{|f_i|^2\} = |\mu_{f_i}|^2 + \sigma_{f_i}^2$, $b_i \in \{-1, 1\}$ is a coefficient selected based on the value of the
one-bit feedback, $\theta_i$ ($0 \le \theta_i \le 1$) is a real-valued long-term power loading coefficient that is adjusted
according to the channel statistics (as will be explained in Section III-B), $P_i$ is the maximum average
power available at the $i$th relay (while the actual transmitted power is $P_i\theta_i^2 \le P_i$), $(\cdot)^*$ denotes the
complex conjugate, and the $T \times T$ matrices $\mathbf{A}_i$ and $\mathbf{B}_i$ are assumed to be either $\mathbf{A}_i = \mathbf{O}$ with $\mathbf{B}_i$ being
unitary, or $\mathbf{B}_i = \mathbf{O}$ with $\mathbf{A}_i$ being unitary. Here, $\mathbf{O}$ is the $T \times T$ matrix of zeros. This assumption implies
that the statistics of the noise remain unchanged and that the transmitted signal at each relay depends
either on its received signal or on the complex conjugate of this signal.
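Using this model, let us introduce the following notations:

$$\text{If } \mathbf{B}_i = \mathbf{O} \text{ then } \tilde{\mathbf{A}}_i = \mathbf{A}_i,\ \tilde{f}_i = f_i,\ \tilde{\mathbf{v}}_i = \mathbf{v}_i,\ \mathbf{s}^{(i)} = \mathbf{s} \tag{4}$$

$$\text{If } \mathbf{A}_i = \mathbf{O} \text{ then } \tilde{\mathbf{A}}_i = \mathbf{B}_i,\ \tilde{f}_i = f_i^*,\ \tilde{\mathbf{v}}_i = \mathbf{v}_i^*,\ \mathbf{s}^{(i)} = \mathbf{s}^* \tag{5}$$

Taking into account (3)-(5), the received signal model (2) can be written as

$$\mathbf{x} = \mathbf{S}(\mathbf{p} \odot \mathbf{h}) + \mathbf{w} \tag{6}$$

where

$$\mathbf{S} = [\tilde{\mathbf{A}}_1\mathbf{s}^{(1)}, \ldots, \tilde{\mathbf{A}}_R\mathbf{s}^{(R)}]$$

is the distributed space-time code matrix,

$$\mathbf{h} = [h_1, \ldots, h_R]^T = [\tilde{f}_1 g_1, \ldots, \tilde{f}_R g_R]^T$$

is the equivalent channel vector,

$$\mathbf{w} = [w_1, \ldots, w_T]^T = \sum_{i=1}^{R} \sqrt{\frac{P_i}{m_{f_i}P_0 + 1}}\, b_i\theta_i g_i \tilde{\mathbf{A}}_i\tilde{\mathbf{v}}_i + \mathbf{n} \tag{7}$$

is the equivalent noise vector,

$$\mathbf{p} = \left[ b_1\theta_1\sqrt{\frac{P_0 P_1 T}{m_{f_1}P_0 + 1}}, \ldots, b_R\theta_R\sqrt{\frac{P_0 P_R T}{m_{f_R}P_0 + 1}} \right]^T \tag{8}$$

and $\odot$ denotes the Schur-Hadamard (element-wise) matrix product.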

III. The Proposed Cooperative Transmission Scheme 

In this section, we address the problem of selecting the coefficients $b_i$ ($i = 1, \ldots, R$) and the long-term
power loading coefficients $\theta_i$ ($i = 1, \ldots, R$). We assume that the value of $\mathbf{p} \odot \mathbf{h}$ is known at the receiver
and there is a perfect (error-free) low-rate feedback link between the receiver and the relays. We will first
introduce the transmission strategy based on one-bit feedback per relay to choose the coefficients $b_i$ for
every channel realization. It will be shown that this transmission scheme achieves maximum diversity.
Subsequently, a further improvement of this scheme will be considered using an additional long-term
real-valued power loading coefficient to feed back from the receiver to each relay. These coefficients will
be computed using second-order channel statistics.

For the sake of simplicity, in this section we assume that $T = 1$. Hence, matrices $\mathbf{A}_i$ and $\mathbf{B}_i$ become
scalars and it is assumed that $A_i = 1$ and $B_i = 0$. Correspondingly, $x$, $v_i$, $n$, $w$ and $s$ become scalars
as well. A more general case when $T > 1$ (and when $\mathbf{A}_i$ and $\mathbf{B}_i$ are matrices rather than scalars) will
be considered in Section IV.

A. Using One-Bit Feedback Per Relay 

Since the long-term power loading is not taken into account in the case of one-bit feedback, all the relays
transmit with the maximum power $P_i$ (i.e., $\theta_i = 1$ for $i = 1, \ldots, R$). In this particular case, the received
signal model (6) reduces to

$$x = \mathbf{1}_R^T(\mathbf{p} \odot \mathbf{h})s + w \tag{9}$$

where $\mathbf{1}_R$ is the $R \times 1$ column vector of ones. For the sake of simplicity, the sub-indices in all scalar
values are hereafter omitted.

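Using (7) and (9), the noise power can be expressed as

$$P_n = \mathrm{E}\{|n|^2\} + \sum_{i=1}^{R} \frac{P_i}{m_{f_i}P_0 + 1}\,\mathrm{E}\{|v_i|^2\}\, b_i^2 |g_i|^2 = 1 + \sum_{i=1}^{R} \frac{P_i |g_i|^2}{m_{f_i}P_0 + 1}. \tag{10}$$

From (10) it is clear that the choice of $b_i$ does not affect the noise power. Using (9), the signal power
can be obtained as

$$P_s = \left|\mathbf{1}_R^T(\mathbf{p} \odot \mathbf{h})\right|^2 \mathrm{E}\{|s|^2\} = \left|\sum_{i=1}^{R} b_i\sqrt{\frac{P_0 P_i T}{m_{f_i}P_0 + 1}}\, f_i g_i\right|^2 = \underbrace{\sum_{i=1}^{R} \rho_{i,i}\,|f_i g_i|^2}_{\gamma} + \underbrace{\sum_{\substack{i,j=1 \\ i \neq j}}^{R} \rho_{i,j}\, b_i b_j\, \mathrm{Re}\{f_i g_i f_j^* g_j^*\}}_{\beta} \tag{11}$$

where

$$\rho_{i,j} = P_0 T \sqrt{\frac{P_i P_j}{(m_{f_i}P_0 + 1)(m_{f_j}P_0 + 1)}} \tag{12}$$

for $i, j = 1, \ldots, R$, and $\mathrm{Re}\{\cdot\}$ denotes the real part operation. In general, $\beta$ can take negative values.
Clearly, such negative values of $\beta$ will reduce the received SNR and affect the achieved diversity. Our
key idea here is to use the coefficients $b_i$ to ensure that $\beta$ is always non-negative. It can be proved using
the same approach as presented in [32] that using values of $b_i \in \{-1, 1\}$ is sufficient to guarantee $\beta \ge 0$.
This results in a scheme with the diversity order proportional to $R$, as stated in the following proposition.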
Proposition 1: If $\beta \ge 0$, then the average symbol error probability (SER) for (9) can be upper bounded
by

$$\mathrm{SER} \le k\, P^{-R\left(1 - \frac{\log\log P}{\log P}\right)} \tag{13}$$

for large $R$ and large $P$, where $P$ is the total power in the network and $k$ is a constant.
Proof: See Appendix.

It follows from Proposition 1 that the achievable diversity order of the proposed scheme is R. 

Since positive values of $\beta$ will provide an additional signal power gain, the optimal values of $b_i$
($i = 1, \ldots, R$) can be obtained through maximizing $\beta$. This is an integer maximization problem that
requires a full search over all possible values of $b_i$. Clearly, if the number of relays is large, then such a
full search procedure can be impractical. To reduce the complexity, we propose a near-optimal solution
based on SDR, which we denote hereafter as Algorithm 1.
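
The full search mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the list `h_eff` is a hypothetical name for the power-weighted effective channels (one complex number per relay), the channels are drawn as unit-variance complex Gaussians, and all relays are given equal weights. The helper `beta` recovers the cross term as the total signal power minus the sum of the individual powers.

```python
import itertools
import random


def full_search_bits(h_eff):
    """Exhaustively search b in {-1, +1}^R maximizing |sum_i b_i * h_eff[i]|^2."""
    best_b, best_power = None, -1.0
    # Fixing b_1 = +1 halves the search, since b and -b give the same power.
    for tail in itertools.product((1, -1), repeat=len(h_eff) - 1):
        b = (1,) + tail
        power = abs(sum(bi * hi for bi, hi in zip(b, h_eff))) ** 2
        if power > best_power:
            best_b, best_power = b, power
    return best_b, best_power


def beta(b, h_eff):
    """Cross term: total signal power minus the sum of per-relay powers."""
    total = abs(sum(bi * hi for bi, hi in zip(b, h_eff))) ** 2
    gamma = sum(abs(hi) ** 2 for hi in h_eff)
    return total - gamma


# Illustrative channel draw (assumption: equal weights, zero-mean unit-variance links).
random.seed(0)
R = 6
f = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(R)]
g = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(R)]
h_eff = [fi * gi for fi, gi in zip(f, g)]

b_opt, p_opt = full_search_bits(h_eff)
print(b_opt, p_opt, beta(b_opt, h_eff))
```

The cross term averages to zero over all sign patterns, so the best pattern cannot yield a negative value. The loop visits $2^{R-1}$ patterns, which is why the search becomes impractical for large $R$.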

Note that, according to (11), the choice of $b_i$ does not affect the value of $\gamma$. Therefore, to maximize
$P_s$, it is sufficient to maximize $\beta$ in (11). Let us express $P_s$ in a more convenient form by extending the
notation for $\rho_{i,j}$ in (12) with

$$\rho_{i,0} \triangleq \sqrt{\frac{P_i P_0 T}{m_{f_i}P_0 + 1}}$$

and denoting

$$\tilde{\mathbf{h}} = [\rho_{1,0} f_1 g_1, \ldots, \rho_{R,0} f_R g_R]^T. \tag{14}$$

Using (14), the signal power (11) can be expressed as

$$P_s = |\tilde{\mathbf{h}}^H\mathbf{b}|^2$$

where $\mathbf{b} = [b_1, \ldots, b_R]^T$. Defining $\mathbf{Q} = \tilde{\mathbf{h}}\tilde{\mathbf{h}}^H$, we can write the optimization problem as

$$\max_{\mathbf{b} \in \{-1,1\}^R} \mathbf{b}^T\mathbf{Q}\mathbf{b}. \tag{15}$$

As $\mathbf{b}^T\mathbf{Q}\mathbf{b} = \mathrm{tr}(\mathbf{b}\mathbf{b}^T\mathbf{Q})$, the optimization problem in (15) can be rewritten as

$$\max_{\mathbf{B}}\ \mathrm{tr}(\mathbf{B}\mathbf{Q}) \quad \text{s.t.} \quad \mathrm{rank}\{\mathbf{B}\} = 1,\ \mathbf{B} \succeq 0,\ [\mathbf{B}]_{i,i} = 1,\ i = 1, \ldots, R \tag{16}$$

where $\mathbf{B} = \mathbf{b}\mathbf{b}^T$, $\mathbf{b} \in \mathbb{R}^R$, $\mathrm{tr}(\cdot)$ stands for the trace of a matrix, $\mathbb{R}$ denotes the set of real numbers,
and $[\mathbf{B}]_{i,i}$ denotes the $i$th diagonal element of $\mathbf{B}$. Problems similar to (16) arise in the context of ML
detection. Solutions close to the optimal one can be efficiently found using the SDR approach [37], whose
essence is to omit the rank-one constraint $\mathrm{rank}\{\mathbf{B}\} = 1$ in (16) and, therefore, approximate the latter
non-convex problem by a convex problem

$$\max_{\mathbf{B}}\ \mathrm{tr}(\mathbf{B}\mathbf{Q}) \quad \text{s.t.} \quad \mathbf{B} \succeq 0,\ [\mathbf{B}]_{i,i} = 1,\ i = 1, \ldots, R. \tag{17}$$

Note that this problem can be efficiently solved using interior point techniques [37]-[39]. Generally, the
resulting solution for $\mathbf{B}$ is not guaranteed to be rank-one. If it is rank-one, then its principal eigenvector
is the optimal solution to (15). Otherwise, a proper approximate solution for $\mathbf{b}$ can be recovered from $\mathbf{B}$
using randomization techniques; see [37] and [39] for more detail.

Thus, our SDR-based approach can be summarized as follows. 

Algorithm 1 

1. At the receiver, find the solution to (17) using the approach of [37].

2. Send the so-obtained $b_i$ from the receiver to the $i$th relay node for each $i = 1, \ldots, R$ using one-bit
per relay feedback.

As it will be shown throughout our simulations, the use of the SDR approach results in a performance 
that is very close to that of the full search-based approach. The complexity of the SDR approach is much 
lower than that of the full search; see [37] for details. 

To further reduce the complexity, let us discuss a simpler algorithm to obtain acceptable values of $b_i$
that can be formulated using the general idea of [32]. The essence of this algorithm is to use a greedy
selection of the values of $b_i$ in a consecutive way. This algorithm can be summarized as the following
sequence of steps:

Algorithm 2

1. Set $b_1 = 1$ and $\tau_1 = h_1$.

2. For $i = 2, \ldots, R$, compute

$$b_i = \mathrm{sign}(\mathrm{Re}\{h_i^*\tau_{i-1}\}), \qquad \tau_i = \tau_{i-1} + b_i h_i$$

where $\mathrm{sign}(\cdot)$ is the sign function.

3. Send the so-obtained $b_i$ from the receiver to the $i$th relay node for each $i = 1, \ldots, R$ using one-bit
per relay feedback.

Note that Algorithm 2 does not result in the optimal values of $b_i$, $i = 1, \ldots, R$. However, Algorithm 2
is computationally much simpler than Algorithm 1. Hence, these two alternative techniques are expected
to provide different performance-to-complexity tradeoffs.
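
The greedy selection of Algorithm 2 can be sketched as follows. The function name `greedy_bits` and the tie-breaking rule (a zero projection is mapped to $+1$) are assumptions made for illustration; the input list `h` is taken to hold the effective channel of each relay, and any per-relay power weighting is assumed to have been absorbed into it beforehand.

```python
import random


def greedy_bits(h):
    """Greedy sign selection: each new term is added coherently to the running sum."""
    b = [1]          # step 1: b_1 = 1
    tau = h[0]       # step 1: tau_1 = h_1
    for hi in h[1:]:
        # step 2: b_i = sign(Re{h_i^* tau_{i-1}}); a zero projection maps to +1 (assumption)
        bi = 1 if (hi.conjugate() * tau).real >= 0 else -1
        b.append(bi)
        tau += bi * hi   # tau_i = tau_{i-1} + b_i h_i
    return b, tau


# Illustrative draw of R = 8 effective channels.
random.seed(1)
h_demo = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(8)]
b_demo, tau_demo = greedy_bits(h_demo)
print(b_demo, abs(tau_demo) ** 2)
```

Each step satisfies $|\tau_{i-1} + b_i h_i|^2 = |\tau_{i-1}|^2 + |h_i|^2 + 2 b_i \mathrm{Re}\{h_i^*\tau_{i-1}\} \ge |\tau_{i-1}|^2 + |h_i|^2$, so the final power is never below the sum of the individual powers. The cost is linear in $R$, compared with the exponential cost of the full search.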

B. Choosing Long-Term Power Loading 

So far, we have not considered the use of power loading, $\theta_i$, for each relay. In practical scenarios,
relays are distributed randomly in an area between the transmitter and the receiver. As a result, the
power loss characteristics in the source-to-relay and relay-to-destination links are different for each
relay. Furthermore, different relays may have different transmitted power constraints. Therefore, in such
scenarios, some power loading strategy should be employed to take into account such differences in
channel quality and/or constraints on the relay transmitted power.

From the performance viewpoint, the optimal power loading should be designed by minimizing the error 
probability as proposed in [36] for a two-relay network using the distributed Alamouti code. However, 
the approach of [36] does not provide any extension to the case of more than two relays. As an alternative 
to the error probability criterion, we propose to use the general idea of [24] to obtain the power loading 
coefficients by maximizing the average SNR subject to individual power constraints. 

In what follows, the maximization of the average received SNR is used as a criterion to design the 
power loading coefficients $\theta_i$. Note that a related strategy to choose the beamforming weights was also
used in [24]. However, we will show that in contrast to [24], full diversity can be achieved in our case
by using the optimal feedback values $b_i$ along with the coefficients $\theta_i$.

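First, let us evaluate the average signal power, that is

$$\mathrm{E}\{P_s\} = \mathrm{E}\{\gamma\} + \mathrm{E}\{\beta\} \tag{18}$$

where

$$\gamma = \sum_{i=1}^{R} \rho_{i,i}\,\theta_i^2\,|f_i g_i|^2 \tag{19}$$

$$\beta = \sum_{\substack{i,j=1 \\ i \neq j}}^{R} \rho_{i,j}\,\theta_i\theta_j\, b_i b_j\, \mathrm{Re}\{f_i g_i f_j^* g_j^*\}. \tag{20}$$

Note that in (18), the analytical evaluation of $\mathrm{E}\{\beta\}$ is very difficult due to the dependence of $b_i$
($i = 1, \ldots, R$) on the instantaneous channel values. Therefore, using (19)-(20) and assuming that the
optimal $b_i$ ($i = 1, \ldots, R$) are selected, we propose to approximate (18) as

$$\mathrm{E}\{P_s\} \approx \sum_{i,j=1}^{R} \rho_{i,j}\,\theta_i\theta_j \left|\mathrm{Re}\{\mathrm{E}\{f_i g_i f_j^* g_j^*\}\}\right|. \tag{21}$$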

The quality of this approximation is illustrated in Fig. 1, where the exact value of $\mathrm{E}\{P_s\}$ and its
approximation (21) are plotted versus $P$ normalized by the noise variance $\sigma^2$. In this figure, it is assumed
that $R = 4$ and $\theta_i = 1$ ($i = 1, \ldots, R$). All the channels are assumed to be complex circular Gaussian
random variables with zero mean and unit variance.

Another important question when using this approximation is how close the values of achieved average
SNR obtained from the approximation in (21) and from the exact value of $\mathrm{E}\{P_s\}$ are. This question
was investigated by means of extensive Monte-Carlo simulations that, for the sake of brevity, are not
presented in all detail. These simulations have involved different channel scenarios, randomly generated
channel coefficients for each particular scenario, and different numbers of relays lying in the interval
$R = 2, \ldots, 7$. The optimal coefficients $\theta_i$ have been obtained by brute force optimization of $\mathrm{E}\{P_s\}$ and
their approximate values have been found by optimizing (21). Then, the achieved average SNRs were
compared for the two so-obtained sets of optimized power loading coefficients.

The results of this comparison have verified that the difference between the exact optimal SNR
(computed numerically via brute force optimization of $\mathrm{E}\{P_s\}$) and its approximation computed via (21)
is, on average, less than 3%. This implies that the approximation (21) is worth using for maximizing the
average SNR by power loading.

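Using the statistical independence of all source-to-relay and relay-to-destination channels, we can now
estimate the expected value in (21) as

$$\mathrm{E}\{f_i g_i f_j^* g_j^*\} = \mathrm{E}\{f_i f_j^*\}\,\mathrm{E}\{g_i g_j^*\} = (\mu_{f_i}\mu_{f_j}^* + \delta_{ij}\sigma_{f_i}^2)(\mu_{g_i}\mu_{g_j}^* + \delta_{ij}\sigma_{g_i}^2) \tag{22}$$

where $\delta_{ij}$ is the Kronecker delta.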

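Let us define the real-valued matrix $\mathbf{Q}$ with the $(i, j)$th entry

$$[\mathbf{Q}]_{i,j} = \rho_{i,j}\left|\mathrm{Re}\{(\mu_{f_i}\mu_{f_j}^* + \delta_{ij}\sigma_{f_i}^2)(\mu_{g_i}\mu_{g_j}^* + \delta_{ij}\sigma_{g_i}^2)\}\right| \tag{23}$$

for $i, j = 1, \ldots, R$. Using (22) and (23), equation (21) can be written as

$$\mathrm{E}\{P_s\} \approx \boldsymbol{\theta}^T\mathbf{Q}\boldsymbol{\theta} \tag{24}$$

where $\boldsymbol{\theta} = [\theta_1, \ldots, \theta_R]^T$. Using the fact that the noise waveforms and the channel coefficients are
statistically independent, the noise power can be expressed as

$$\mathrm{E}\{P_n\} = \mathrm{E}\{|n|^2\} + \sum_{i=1}^{R} \frac{P_i\theta_i^2}{m_{f_i}P_0 + 1}\,\mathrm{E}\{|v_i|^2\}\,\mathrm{E}\{|g_i|^2\} = 1 + \sum_{i=1}^{R} \frac{P_i\theta_i^2\, m_{g_i}}{m_{f_i}P_0 + 1}$$

and further rewritten as

$$\mathrm{E}\{P_n\} = \boldsymbol{\theta}^T\mathbf{W}\boldsymbol{\theta} + 1 \tag{25}$$

where

$$\mathbf{W} = \mathrm{diag}\left\{\frac{P_1 m_{g_1}}{m_{f_1}P_0 + 1}, \ldots, \frac{P_R m_{g_R}}{m_{f_R}P_0 + 1}\right\},$$

$m_{g_i} = \mathrm{E}\{|g_i|^2\} = |\mu_{g_i}|^2 + \sigma_{g_i}^2$, and $\mathrm{diag}(\cdot)$ denotes a diagonal matrix. Using (24) and (25), the maximization of the receiver SNR over
$\boldsymbol{\theta}$ can be approximately written as

$$\max_{\boldsymbol{\theta}}\ \frac{\boldsymbol{\theta}^T\mathbf{Q}\boldsymbol{\theta}}{\boldsymbol{\theta}^T\mathbf{W}\boldsymbol{\theta} + 1} \quad \text{s.t.} \quad 0 \le \theta_i \le 1,\ i = 1, \ldots, R \tag{26}$$

where instead of the signal power we use its approximation given by (21).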
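
As a rough numerical illustration of the approximate SNR objective, the sketch below builds the signal and noise matrices from second-order channel statistics and evaluates the ratio for a given loading vector. The function names, argument names, and the demo values are hypothetical. The sketch also assumes that $\rho_{i,j}$ factors as the product of the per-relay terms $\rho_{i,0}\rho_{j,0}$, and that the magnitude of the real part is used for every entry of the signal matrix.

```python
import math


def rho0(P_i, P0, T, m_f):
    """Per-relay amplitude factor sqrt(P_i * P0 * T / (m_f * P0 + 1))."""
    return math.sqrt(P_i * P0 * T / (m_f * P0 + 1.0))


def build_Q_W(mu_f, var_f, mu_g, var_g, P, P0, T=1):
    """Return the signal matrix Q (list of lists) and the diagonal of the noise matrix W."""
    R = len(P)
    m_f = [abs(mu_f[i]) ** 2 + var_f[i] for i in range(R)]
    m_g = [abs(mu_g[i]) ** 2 + var_g[i] for i in range(R)]
    r0 = [rho0(P[i], P0, T, m_f[i]) for i in range(R)]
    Q = [[0.0] * R for _ in range(R)]
    for i in range(R):
        for j in range(R):
            d = 1.0 if i == j else 0.0  # Kronecker delta
            ef = mu_f[i] * mu_f[j].conjugate() + d * var_f[i]
            eg = mu_g[i] * mu_g[j].conjugate() + d * var_g[i]
            Q[i][j] = r0[i] * r0[j] * abs((ef * eg).real)
    W_diag = [P[i] * m_g[i] / (m_f[i] * P0 + 1.0) for i in range(R)]
    return Q, W_diag


def approx_snr(theta, Q, W_diag):
    """Approximate average SNR: theta^T Q theta / (theta^T W theta + 1)."""
    R = len(theta)
    num = sum(theta[i] * Q[i][j] * theta[j] for i in range(R) for j in range(R))
    den = sum(W_diag[i] * theta[i] ** 2 for i in range(R)) + 1.0
    return num / den


# Demo: two relays, zero-mean unit-variance links, unit powers (illustrative values).
zeros = [0j, 0j]
ones = [1.0, 1.0]
Q_demo, W_demo = build_Q_W(zeros, ones, zeros, ones, P=[1.0, 1.0], P0=1.0)
snr_demo = approx_snr([1.0, 1.0], Q_demo, W_demo)
print(snr_demo)
```

For zero-mean channels the off-diagonal entries of the signal matrix vanish, so in this demo the objective reduces to a ratio of two weighted sums of $\theta_i^2$.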
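
If the aggregate power constraint ($\boldsymbol{\theta}^T\boldsymbol{\theta} = R$) is used instead of the individual relay power constraints
in (26), the resulting problem becomes

$$\max_{\boldsymbol{\theta}}\ \frac{\boldsymbol{\theta}^T\mathbf{Q}\boldsymbol{\theta}}{\boldsymbol{\theta}^T\left(\mathbf{W} + \frac{1}{R}\mathbf{I}_R\right)\boldsymbol{\theta}} \quad \text{s.t.} \quad \boldsymbol{\theta}^T\boldsymbol{\theta} = R. \tag{27}$$

Solving (27) amounts to the unconstrained optimization of the objective function in (27) (which boils down
to solving a generalized eigenvector problem) followed by rescaling the so-obtained vector $\boldsymbol{\theta}$ to satisfy
the constraint $\boldsymbol{\theta}^T\boldsymbol{\theta} = R$.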
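
A minimal sketch of this two-step procedure is given below. It relies on the identity $\boldsymbol{\theta}^T\mathbf{W}\boldsymbol{\theta} + 1 = \boldsymbol{\theta}^T(\mathbf{W} + \mathbf{I}/R)\boldsymbol{\theta}$ under the aggregate constraint. The generalized eigenproblem is solved here by plain power iteration after a diagonal change of variables, which is a substitute chosen to keep the example dependency-free rather than the solver used by the authors. It assumes a symmetric signal matrix with non-negative entries and positive diagonal, a diagonal noise matrix, and hypothetical function and argument names.

```python
import math


def aggregate_power_loading(Q, W_diag, iters=500):
    """Maximize theta^T Q theta / theta^T (W + I/R) theta, then rescale to theta^T theta = R."""
    R = len(W_diag)
    # d holds the diagonal of (W + I/R)^(-1/2); substituting y = (W + I/R)^(1/2) theta
    # turns the ratio into an ordinary Rayleigh quotient of M = D Q D.
    d = [1.0 / math.sqrt(W_diag[i] + 1.0 / R) for i in range(R)]
    M = [[d[i] * Q[i][j] * d[j] for j in range(R)] for i in range(R)]
    y = [1.0] * R
    for _ in range(iters):
        z = [sum(M[i][j] * y[j] for j in range(R)) for i in range(R)]
        norm = math.sqrt(sum(v * v for v in z))
        y = [v / norm for v in z]
    theta = [d[i] * y[i] for i in range(R)]
    scale = math.sqrt(R / sum(t * t for t in theta))
    return [scale * t for t in theta]


# Demo with a diagonal signal matrix: all power goes to the stronger relay.
theta_demo = aggregate_power_loading([[2.0, 0.0], [0.0, 1.0]], [0.5, 0.5])
print(theta_demo)
```

In the demo the solution concentrates the whole power budget on one relay, which illustrates why SNR maximization alone can sacrifice diversity.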

In what follows, we consider a more practical case of individual power constraints rather than the 
aggregate power constraint. 

As mentioned above, the design of power loading coefficients by maximizing the average SNR does 
not take into account the diversity aspect of the problem. In fact, maximizing the average SNR can result 
in a solution with a poor performance in terms of the error probability. This can particularly be the case 
if some of the resulting values of $\theta_i$ are small, so that the diversity order suffers. Indeed, if $\theta_i$ is close to
zero at the $i$th relay, then this is equivalent to switching off the $i$th relay for all the transmissions within
the time interval where the current value of $\theta_i$ is used. According to Proposition 1, this will reduce the
diversity order. 

To prevent such a loss in diversity, an additional constraint $\theta_i^2 \ge \theta_\ell^2$ can be used, where $\theta_\ell$ is a
preselected minimum power loading value that establishes a tradeoff between the diversity and power
loading performance. If $\theta_\ell$ is chosen too large, then the admissible interval for $\theta_i$ will be smaller, and this may
prevent the scheme from achieving any significant improvement in the performance due to power loading.
Conversely, if $\theta_\ell$ is chosen too small, a substantial diversity loss can occur.

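Defining $\boldsymbol{\Theta} = \boldsymbol{\theta}\boldsymbol{\theta}^T$, we can rewrite (26) as

$$\max_{\boldsymbol{\Theta}}\ \frac{\mathrm{tr}(\mathbf{Q}\boldsymbol{\Theta})}{\mathrm{tr}(\mathbf{W}\boldsymbol{\Theta}) + 1} \quad \text{s.t.} \quad \theta_\ell^2 \le [\boldsymbol{\Theta}]_{i,i} \le 1,\ i = 1, \ldots, R, \quad \mathrm{rank}\{\boldsymbol{\Theta}\} = 1,\ \boldsymbol{\Theta} \succeq 0 \tag{28}$$

where $\boldsymbol{\Theta} \succeq 0$ means that $\boldsymbol{\Theta}$ is positive semi-definite. Introducing the auxiliary variable $t$, (28) can be
written as

$$\max_{\boldsymbol{\Theta},\, t}\ t \quad \text{s.t.} \quad \mathrm{tr}(\boldsymbol{\Theta}(\mathbf{Q} - t\mathbf{W})) \ge t, \quad \theta_\ell^2 \le [\boldsymbol{\Theta}]_{i,i} \le 1,\ i = 1, \ldots, R, \quad \mathrm{rank}\{\boldsymbol{\Theta}\} = 1,\ \boldsymbol{\Theta} \succeq 0. \tag{29}$$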

As the rank constraint in (29) is non-convex, this optimization problem cannot be solved efficiently.
Using the SDR approach (i.e., ignoring the constraint $\mathrm{rank}\{\boldsymbol{\Theta}\} = 1$ in (29)), a quasi-convex optimization
problem can be obtained from (29) that can be directly solved using the bisection technique [24], [38].
Based on the latter technique, the optimal value $t_{\mathrm{opt}}$ is found in the interval $[t_{\mathrm{low}}, t_{\mathrm{up}}]$, where $t_{\mathrm{low}}$ is a
feasible value and, therefore, $t_{\mathrm{opt}} \ge t_{\mathrm{low}}$, and $t_{\mathrm{up}}$ is not a feasible value and, therefore, $t_{\mathrm{opt}} < t_{\mathrm{up}}$. The
algorithm solves the feasibility problem

$$\text{find } \boldsymbol{\Theta} \quad \text{s.t.} \quad \mathrm{tr}(\boldsymbol{\Theta}(\mathbf{Q} - t\mathbf{W})) \ge t, \quad \boldsymbol{\Theta} \succeq 0, \quad \theta_\ell^2 \le [\boldsymbol{\Theta}]_{i,i} \le 1,\ i = 1, \ldots, R \tag{30}$$

at the midpoint of the interval, $t = (t_{\mathrm{low}} + t_{\mathrm{up}})/2$. If it is feasible, $t_{\mathrm{low}}$ is updated as $t_{\mathrm{low}} = t$. If it is not
feasible, $t_{\mathrm{up}}$ is updated as $t_{\mathrm{up}} = t$. Then, the algorithm continues to solve the feasibility problem with
the new interval until $t_{\mathrm{up}} - t_{\mathrm{low}} < \epsilon$, where $\epsilon$ is a parameter denoting the acceptable tolerance of the
solution. The optimal matrix $\boldsymbol{\Theta}_{\mathrm{opt}}$ is selected as $\boldsymbol{\Theta}$ for the last feasible $t$ (i.e., $t = t_{\mathrm{low}}$ in the last step).
If the matrix $\boldsymbol{\Theta}_{\mathrm{opt}}$ is rank-one, then its principal eigenvector is the optimal solution to (26). If $\boldsymbol{\Theta}_{\mathrm{opt}}$ is
not rank-one, then a proper approximate solution for $\boldsymbol{\theta}$ can be obtained using randomization techniques
[39].
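
The bisection loop described above can be sketched independently of the solver used for the feasibility problem. In the sketch below, `is_feasible` is a hypothetical callback that would wrap a semidefinite feasibility solver in the general case. To keep the example self-contained, it is instantiated only for the scalar case $R = 1$, where the feasibility test can be checked by hand; the names `scalar_oracle`, `q`, `w`, and `theta_min` are assumptions for illustration.

```python
def bisect_max_t(is_feasible, t_low, t_up, eps=1e-6):
    """Return the largest feasible t (within eps), given feasible t_low and infeasible t_up."""
    while t_up - t_low >= eps:
        t = 0.5 * (t_low + t_up)
        if is_feasible(t):
            t_low = t   # midpoint is feasible: raise the lower end
        else:
            t_up = t    # midpoint is infeasible: lower the upper end
    return t_low


def scalar_oracle(q, w, theta_min):
    """Feasibility test for R = 1, where Theta is a scalar in [theta_min^2, 1]."""
    def is_feasible(t):
        # Theta * (q - t * w) is linear in Theta, so checking both endpoints suffices.
        return max(c * (q - t * w) for c in (theta_min ** 2, 1.0)) >= t
    return is_feasible


q, w = 3.0, 0.5
t_opt = bisect_max_t(scalar_oracle(q, w, theta_min=0.2), t_low=0.0, t_up=q)
print(t_opt)  # close to q / (w + 1) for this scalar example
```

The number of feasibility problems solved grows only logarithmically with $(t_{\mathrm{up}} - t_{\mathrm{low}})/\epsilon$, so the overall cost is dominated by the cost of a single feasibility check.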

IV. Extended Distributed Alamouti Code 

The scheme developed in the previous section applies to the case of T = 1. In what follows, we extend 
it to the case of T = 2 by developing an approach based on the distributed Alamouti code to reduce the 
total feedback rate. Using computer simulations, the latter scheme will be shown to provide robustness 
against feedback errors. Such improvements in the feedback rate and robustness are, however, achieved 
at the price of an increased decoding delay and a moderate performance loss as compared to the case of 
T = 1. 

Let us consider the case of an even number of relays², i.e., let $R = 2K$ where $K$ is some positive
integer. The distributed Alamouti code is used by relay pairs. Let each $k$th relay pair receive a low-rate
feedback to select the binary coefficient $b_k \in \{-1, 1\}$ and the real-valued power loading coefficient
$\theta_k \in [0, 1]$. Since the same $b_k$ and $\theta_k$ should be used by the two relays of the $k$th relay pair, the receiver
can broadcast them to both these relays, thereby reducing the feedback rate almost by half.

²The case of an odd number of relays can be addressed in the same way and, therefore, is omitted below.
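
The relays use the basic distributed Alamouti code matrices [14] to form the signal transmitted by
each relay pair as: $\mathbf{A}_{2k-1} = \mathbf{I}_2$ (with $\mathbf{B}_{2k-1} = \mathbf{O}$) and $\mathbf{B}_{2k} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ with $\mathbf{A}_{2k} = \mathbf{O}$, where $\mathbf{I}_2$ is
the $2 \times 2$ identity matrix. Using the notation of Section II, we obtain the following distributed space-time code matrix for
the proposed scheme:

$$\mathbf{S} = [\mathbf{S}_a, \ldots, \mathbf{S}_a] \tag{31}$$

where

$$\mathbf{S}_a = \begin{bmatrix} s_1 & -s_2^* \\ s_2 & s_1^* \end{bmatrix}$$

is the conventional Alamouti code matrix. The channel and relay power vectors are given by

$$\mathbf{h} = [f_1 g_1, f_2^* g_2, \ldots, f_{2K-1} g_{2K-1}, f_{2K}^* g_{2K}]^T \tag{32}$$

$$\mathbf{p} = [p_1, p_2, \ldots, p_{2K-1}, p_{2K}]^T = \left[ b_1\theta_1\sqrt{\frac{P_0 P_1 T}{m_{f_1}P_0 + 1}},\ b_1\theta_1\sqrt{\frac{P_0 P_2 T}{m_{f_2}P_0 + 1}},\ \ldots,\ b_K\theta_K\sqrt{\frac{P_0 P_{2K-1} T}{m_{f_{2K-1}}P_0 + 1}},\ b_K\theta_K\sqrt{\frac{P_0 P_{2K} T}{m_{f_{2K}}P_0 + 1}} \right]^T \tag{33}$$

respectively.

Note that in contrast to (8), any $(2k-1)$th and $(2k)$th relays use the same $b_k\theta_k$.
Conjugating the second entry $x_2$ of the vector $\mathbf{x} = [x_1, x_2]^T$ in (6) and using (31)-(33), we obtain the
following equivalent model

$$\tilde{\mathbf{x}} = \mathbf{H}\tilde{\mathbf{s}} + \tilde{\mathbf{w}} \tag{34}$$

where $\tilde{\mathbf{x}} = [x_1, x_2^*]^T$, $\tilde{\mathbf{s}} = [s_1, s_2^*]^T$, $\tilde{\mathbf{w}} = [w_1, w_2^*]^T$,

$$\mathbf{H} = \sum_{k=1}^{K} \mathbf{H}_k \quad \text{and} \quad \mathbf{H}_k = \begin{bmatrix} p_{2k-1}h_{2k-1} & -p_{2k}h_{2k} \\ p_{2k}h_{2k}^* & p_{2k-1}h_{2k-1}^* \end{bmatrix}.$$

Note that

$$\mathbf{H}^H\mathbf{H} = \begin{bmatrix} \gamma_a + \beta_a & 0 \\ 0 & \gamma_a + \beta_a \end{bmatrix}$$

where

$$\gamma_a = \|\mathbf{p} \odot \mathbf{h}\|^2 \tag{35}$$

$$\beta_a = \sum_{\substack{i,j=1 \\ i \neq j}}^{K} \theta_i b_i \theta_j b_j\, \mathrm{Re}\{\rho_{2i-1,2j-1} h_{2i-1} h_{2j-1}^* + \rho_{2i,2j} h_{2i} h_{2j}^*\}. \tag{36}$$

Throughout (31)-(36), the subindex $(\cdot)_a$ stands for the extended Alamouti scheme.

Since the matrices $\tilde{\mathbf{A}}_i$ satisfy the property $\tilde{\mathbf{A}}_i\tilde{\mathbf{A}}_i^H = \mathbf{I}_T$, the noise covariance matrix $\mathrm{E}\{\mathbf{w}\mathbf{w}^H\}$
is a scaled identity matrix. Therefore, the ML decoding

$$\arg\min_{\tilde{\mathbf{s}}} \|\tilde{\mathbf{x}} - \mathbf{H}\tilde{\mathbf{s}}\|$$

reduces to simple symbol-by-symbol decoding.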
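
The decoupling behind symbol-by-symbol decoding can be checked numerically. The sketch below assumes that the equivalent channel matrix has the Alamouti structure $\begin{bmatrix} a & -c \\ c^* & a^* \end{bmatrix}$, with $a$ collecting the weighted channels of the odd-numbered relays and $c$ those of the even-numbered relays. The function names and the random demo values are hypothetical, and the noise-free case is used only to show that the matched filter separates the two symbols.

```python
import random


def equivalent_channel(p, h):
    """Build the 2x2 matrix H from weights p and channels h (both of length 2K)."""
    a = sum(p[i] * h[i] for i in range(0, len(h), 2))  # relays 1, 3, ..., 2K-1
    c = sum(p[i] * h[i] for i in range(1, len(h), 2))  # relays 2, 4, ..., 2K
    return [[a, -c], [c.conjugate(), a.conjugate()]]


def gram(H):
    """Return H^H H for a 2x2 complex matrix."""
    return [[sum(H[k][i].conjugate() * H[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def decode(H, x_tilde):
    """Matched filter H^H x_tilde, scaled by the common diagonal entry of H^H H."""
    gain = gram(H)[0][0].real
    return [sum(H[k][i].conjugate() * x_tilde[k] for k in range(2)) / gain
            for i in range(2)]


# Illustrative draw for K = 3 relay pairs (noise-free).
random.seed(2)
K = 3
h = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2 * K)]
p = [random.choice((1, -1)) * random.uniform(0.5, 1.5) for _ in range(2 * K)]
H = equivalent_channel(p, h)
G = gram(H)
s_tilde = [complex(1, 1), complex(-1, 1)]
x_tilde = [H[0][0] * s_tilde[0] + H[0][1] * s_tilde[1],
           H[1][0] * s_tilde[0] + H[1][1] * s_tilde[1]]
s_hat = decode(H, x_tilde)
print(G[0][1], s_hat)
```

Because the off-diagonal entries of the Gram matrix vanish, each entry of the matched-filter output depends on a single symbol, so each symbol can be detected separately.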

As $|h_k|^2 = |f_k g_k|^2$, it is clear from (35), (36) and Proposition 1 that the maximum diversity can be
achieved if $\beta_a \ge 0$. Similar to [32], it can be proved that if $b_k \in \{-1, 1\}$, it can be guaranteed that
$\beta_a \ge 0$.

As in Section [nil the coefficients bk can be selected using the exhaustive full search, a suboptimal SDR 
approach similar to Algorithm 1, or an iterative procedure similar to Algorithm 2. To develop the SDR 
approach for the extended distributed Alamouti code case, we define the A' x 1 vector = . . . , bx]^ 
and the 2 x K matrix 

Plh Pshs ■ ■ ■ P2K~lh2K~l 

\J I ) 

P2h2 Pihi ■■■ P2Kh2K 

Using (|37] ). we obtain that 

-ia + (ia = blF^Fb,. (38) 

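Defining $\mathbf{Q}_a = \mathbf{F}^H\mathbf{F}$, we can write the problem of optimal selection of the coefficients $b_k$ ($k = 1, \ldots, K$)
as

$$\max_{\mathbf{b}_a \in \{-1,1\}^K} \mathbf{b}_a^T \mathbf{Q}_a \mathbf{b}_a.$$

Using the notation $\mathbf{B}_a = \mathbf{b}_a \mathbf{b}_a^T$, this problem can be rewritten as

$$\max_{\mathbf{B}_a} \ \mathrm{tr}(\mathbf{B}_a \mathbf{Q}_a) \quad \text{s.t.} \quad \mathrm{rank}\{\mathbf{B}_a\} = 1, \ \mathbf{B}_a \succeq 0, \ [\mathbf{B}_a]_{k,k} = 1, \ k = 1, \ldots, K \qquad (39)$$

and, using the SDR approach, it can be approximately converted to a convex form

$$\max_{\mathbf{B}_a} \ \mathrm{tr}(\mathbf{B}_a \mathbf{Q}_a) \quad \text{s.t.} \quad \mathbf{B}_a \succeq 0, \ [\mathbf{B}_a]_{k,k} = 1, \ k = 1, \ldots, K \qquad (40)$$

by omitting the rank-one constraint $\mathrm{rank}\{\mathbf{B}_a\} = 1$ in (39).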
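
For small $K$, the combinatorial problem that (40) relaxes can be solved directly by the exhaustive full search. The Python sketch below shows that baseline only; it is not the SDR algorithm, which would additionally need a semidefinite programming solver. The function names and the example matrix are illustrative assumptions.

```python
from itertools import product

def quadratic_form(Q, b):
    """Return b^T Q b for a real sign vector b and a Hermitian (or real symmetric) Q."""
    K = len(b)
    return sum(b[i] * b[j] * Q[i][j] for i in range(K) for j in range(K)).real

def exhaustive_sign_search(Q):
    """Maximize b^T Q b over b in {-1, 1}^K by enumerating 2^(K-1) candidates."""
    K = len(Q)
    best_b, best_val = None, float("-inf")
    # b and -b give the same objective value, so b_1 = 1 is fixed without loss.
    for tail in product((1, -1), repeat=K - 1):
        b = (1,) + tail
        val = quadratic_form(Q, b)
        if val > best_val:
            best_b, best_val = b, val
    return best_b, best_val

# Illustrative 3x3 example (hypothetical values, not taken from the paper).
Q_example = [[2, -1, 0],
             [-1, 2, -1],
             [0, -1, 2]]
b_opt, value = exhaustive_sign_search(Q_example)
```

The number of candidates doubles with every additional relay pair, which is the motivation for the relaxed problem (40) and for the greedy alternative.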

The SDR-based algorithm for the proposed distributed Alamouti approach can be summarized as follows.
Algorithm 3

1. At the receiver, find the solution to (40) using the approach of [37].

2. Send the so-obtained $b_k$ from the receiver to the $k$th relay pair for each $k = 1, \ldots, K$ using one-bit
per relay pair feedback.

In turn, the greedy algorithm of Section III can be modified as follows.
Algorithm 4

1. Set $b_1 = 1$ and $\boldsymbol{\tau}_1 = [h_1 \ h_2]^T$.

2. For $k = 2, \ldots, K$, compute

$b_k = \mathrm{sign}\left(\mathrm{Re}\{[h_{2k-1}^* \ h_{2k}^*]\, \boldsymbol{\tau}_{k-1}\}\right)$, $\boldsymbol{\tau}_k = \boldsymbol{\tau}_{k-1} + b_k [h_{2k-1} \ h_{2k}]^T$.

3. Send the so-obtained $b_k$ from the receiver to the $k$th relay pair for each $k = 1, \ldots, K$ using one-bit
per relay pair feedback.
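
The greedy recursion of Algorithm 4 can be sketched in a few lines of Python. This is a minimal illustration that works directly on the channel products $h_i$, as the algorithm statement does. The function name is hypothetical, and taking the sign of zero as $+1$ is an assumption made here for definiteness.

```python
def greedy_pair_signs(h):
    """h lists h_1, ..., h_{2K}; returns the signs b_1, ..., b_K and the final tau_K."""
    K = len(h) // 2
    b = [1]                                   # step 1: b_1 = 1
    tau = [h[0], h[1]]                        # tau_1 = [h_1, h_2]^T
    for k in range(1, K):
        pair = [h[2 * k], h[2 * k + 1]]       # channel products of the next relay pair
        corr = sum(c.conjugate() * t for c, t in zip(pair, tau)).real
        bk = 1 if corr >= 0 else -1           # sign(0) taken as +1 (assumption)
        b.append(bk)
        tau = [t + bk * c for t, c in zip(tau, pair)]
    return b, tau

# Illustrative channel products for K = 3 relay pairs (hypothetical values).
h_example = [1, 1j, -1, -1j, 1j, 1]
b_greedy, tau_final = greedy_pair_signs(h_example)
```

Because each sign is chosen so that the cross term with the running sum is non-negative, the squared norm of the accumulated vector is never smaller than the sum of the individual squared magnitudes.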

To derive the power loading coefficients $\theta_k$, an approach similar to that presented in Section III-B
can be applied. We first develop an approximation to the expected value of the signal power and then
maximize a lower bound on the SNR. Using (34), the average signal power can be written as

$$E\{P_s\} = E\{\gamma_a\} + E\{\beta_a\}. \qquad (41)$$

Using (35), (36) and the same arguments as in Section III-B, $E\{P_s\}$ can be approximated as

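$$E\{P_s\} \approx \sum_{i,j=1}^{K} \theta_i \theta_j \left| E\{\mathrm{Re}\{\rho_{2i-1,2j-1} h_{2i-1} h_{2j-1}^* + \rho_{2i,2j} h_{2i} h_{2j}^*\}\} \right| \qquad (42)$$

where it is assumed that the optimal values of $b_i$ ($i = 1, \ldots, K$) are selected.
The expected value of the noise power is given by

$$E\{P_n\} = 1 + \sum_{k=1}^{K} \theta_k^2 \left( \frac{P_{2k-1} m_{g_{2k-1}}}{P_0 m_{f_{2k-1}} + 1} + \frac{P_{2k} m_{g_{2k}}}{P_0 m_{f_{2k}} + 1} \right). \qquad (43)$$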

Using (42) and (43), the SNR maximization problem can be approximated as

^"^^^i^l'^ = l'-'^ (44) 

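where $\boldsymbol{\theta}_a = [\theta_1 \cdots \theta_K]^T$,

$$\mathbf{W}_a = \mathrm{diag}\left\{ \sum_{l=1}^{2} \frac{P_l m_{g_l}}{P_0 m_{f_l} + 1}, \ldots, \sum_{l=2K-1}^{2K} \frac{P_l m_{g_l}}{P_0 m_{f_l} + 1} \right\},$$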

$\mathbf{Q}_a$ is a $K \times K$ matrix with the entries

$$[\mathbf{Q}_a]_{i,j} = \left| \mathrm{Re}\{\rho_{2i-1,2j-1} E\{h_{2i-1} h_{2j-1}^*\} + \rho_{2i,2j} E\{h_{2i} h_{2j}^*\}\} \right| \qquad (45)$$

and $\vartheta_k$ constrains the coefficients $\theta_k$ to prevent diversity losses in a way similar to that described in
Section III. Now, the expected value in (45) can be estimated using the statistical independence of the
channels as in (22). In particular, for the $(2i, 2j)$th factor in (45), we have



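$$E\{h_{2i} h_{2j}^*\} = \left( \mu_{f_{2i}} \mu_{f_{2j}}^* + \delta_{(2i)(2j)} \sigma_{f_{2i}}^2 \right) \left( \mu_{g_{2i}} \mu_{g_{2j}}^* + \delta_{(2i)(2j)} \sigma_{g_{2i}}^2 \right).$$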

Following the same steps as in Section III, the optimization problem in (44) can be turned into a
convex feasibility problem that extends (30) to the distributed Alamouti coding case.

V. Differential transmission

The concept of differential transmission is used in this section to extend the proposed approach to the
case where no CSI is available at the receiver. Let us assume that $T = 1$ and let the transmitter encode
differentially the information symbols $s_l$ selected from some constant-modulus constellation $\mathcal{S}$ as

$$u_l = u_{l-1} s_l, \quad u_0 = 1$$

where $u_l$ and $u_0$ are the current and initial transmitted symbols, respectively. Similar to the coherent
scheme in (9), we have

$$x_l = \mathbf{1}_R^T (\mathbf{p} \odot \mathbf{h}) u_l + w_l. \qquad (46)$$

Using (46) and the previous received signal $x_{l-1}$, the ML symbol estimate can be obtained from
maximizing [1]

$$\mathrm{Re}\{x_{l-1} x_l^* s_l\}$$

over $s_l \in \mathcal{S}$. We assume that no power loading is used, i.e., set $\theta_i = 1$ for $i = 1, \ldots, R$. Since the
receiver has no CSI to select the feedback bits for $b_i$ ($i = 1, \ldots, R$), the following simple sequential
feedback bit assignment scheme can be used. Before the beginning of the frame in which the information
symbols should be transmitted, there is an extra transmission stage to select the coefficients $b_i$. First, $u_0$
is transmitted from the source to the relays and then it is retransmitted by the relays to the destination
with $b_i = 1$ ($i = 1, \ldots, R$). Then, only the second relay alters its coefficient $b_2$ to $-1$ and the relays
retransmit again. The received powers corresponding to the latter two relay-to-destination transmissions
are compared at the receiver and the receiver sends one bit of feedback. This bit is used by the second
relay to select the value of $b_2$ that corresponds to the maximum received power. The process continues with the
remaining relays in the same way. This makes it possible to select all the coefficients $b_i$ ($i = 2, \ldots, R$)
in a sequential (greedy) way. After the process of selecting the coefficients $b_i$ is completed, the source
starts the transmission of its information symbols. The overall transmission strategy can be summarized
as follows:
Algorithm 5

1. Set $b_i = 1$, $i = 1, \ldots, R$. Transmit $u_0$ from the source to relays and then retransmit it from the
relays to the destination to obtain $x_1 = \mathbf{1}_R^T (\mathbf{p} \odot \mathbf{h}) u_0 + w_1$ at the receiver.

2. For $j = 2, \ldots, R$:

• At the $j$th relay, set $b_j = -1$ and, using (3), update the signal $d_j$ to be transmitted from this
particular relay.

• Transmit signals from all relays to obtain $x_j = \mathbf{1}_R^T (\mathbf{p} \odot \mathbf{h}) u_0 + w_j$ at the receiver.

• If $|x_j|^2 > |x_{j-1}|^2$, then feed "1" from the receiver back to the relay; otherwise feed "0" back
to the relay. In the latter case, set $x_j = x_{j-1}$.

• If the received feedback at the $j$th relay is 1, then select $b_j = -1$. Otherwise, select $b_j = 1$.
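
The sequential selection in Algorithm 5 can be sketched as follows. This is a minimal Python illustration under stated assumptions: $u_0 = 1$, each list entry is the effective contribution of one relay with its sign factored out, and the receiver noise defaults to zero. The function and argument names are hypothetical.

```python
def sequential_sign_selection(ph, noise=None):
    """ph[i]: effective contribution of relay i+1 with its sign b_i factored out.
    noise[j]: receiver noise sample of trial j (assumed zero if omitted).
    Returns the selected signs and the largest received power observed."""
    R = len(ph)
    noise = noise or [0.0] * R
    b = [1] * R
    best_power = abs(sum(ph) + noise[0]) ** 2          # step 1: all b_i = 1
    for j in range(1, R):
        b[j] = -1                                      # relay j+1 flips its sign
        x = sum(bi * c for bi, c in zip(b, ph)) + noise[j]
        feedback = 1 if abs(x) ** 2 > best_power else 0  # one feedback bit
        if feedback == 1:
            best_power = abs(x) ** 2                   # relay keeps b_j = -1
        else:
            b[j] = 1                                   # relay reverts; x_j = x_{j-1}
    return b, best_power

# Illustrative values for R = 4 relays (hypothetical, noise-free).
b_sel, power = sequential_sign_selection([1, -1, 1j, -1j])
```

Only received powers are compared, so no channel estimate is needed at the receiver, and each trial costs exactly one feedback bit.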
Similarly, a differential modification of the extended distributed Alamouti code of Section IV can be
developed in the case when $T = 2$. At the transmitter, a unitary matrix $\mathbf{S}_l$ should be formed from the
constant-modulus information symbols $s_{2l-1}, s_{2l}$ as

$$\mathbf{S}_l = \frac{1}{\sqrt{2}} \begin{bmatrix} s_{2l-1} & -s_{2l}^* \\ s_{2l} & s_{2l-1}^* \end{bmatrix}.$$

Let the differential encoding

$$\mathbf{u}_l = \mathbf{S}_l \mathbf{u}_{l-1}$$

be used at the transmitter. It amounts to sending the vector $\mathbf{u}_l = [u_{2l-1}, u_{2l}]^T$ instead of $\mathbf{s}_l = [s_{2l-1}, s_{2l}]^T$
to the relays where $l$ denotes the transmitted block number. The first vector can be chosen as
$\mathbf{u}_0 = [1, 0]^T$. Similar to Q and using the matrices $\mathbf{A}_{2k-1}$ and $\mathbf{A}_{2k}$ defined in the previous section
for the extended distributed Alamouti code, the following equivalent relation can be obtained



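$$\mathbf{x}_l = \mathbf{S}_l \mathbf{U}_{l-1} \sum_{k=1}^{K} (\mathbf{p}_k \odot \mathbf{h}_k) + \mathbf{w}_l$$

where

$$\mathbf{U}_{l-1} = \begin{bmatrix} u_{2l-3} & -u_{2l-2}^* \\ u_{2l-2} & u_{2l-3}^* \end{bmatrix}, \quad l > 1$$

$$\mathbf{h}_k = [f_{2k-1} g_{2k-1}, \ f_{2k} g_{2k}]^T$$

$$\mathbf{p}_k = \left[ \sqrt{\frac{P_0 P_{2k-1} T}{m_{f_{2k-1}} P_0 + 1}}, \ \sqrt{\frac{P_0 P_{2k} T}{m_{f_{2k}} P_0 + 1}} \right]^T b_k \theta_k.$$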



The ML decoding amounts to maximizing [1]

$$\mathrm{Re}\{\mathrm{tr}(\mathbf{x}_{l-1} \mathbf{x}_l^H \mathbf{S}_l)\}$$

over $s_{2l-1}, s_{2l} \in \mathcal{S}$. Note that the detection can be done symbol-by-symbol. As in the previous scheme
without DSTC, we set $\theta_i = 1$ and use a similar strategy to select the coefficients $b_i$ using relay pairs and
blocks of length $T = 2$. This strategy can be summarized as follows:
Algorithm 6

1. Set $b_i = 1$, $i = 1, \ldots, K$. Transmit $\mathbf{u}_0$ from the source to relays and then retransmit it from the
relays to the destination to obtain $\mathbf{x}_1 = \mathbf{U}_0 \left( \sum_{k=1}^{K} \mathbf{p}_k \odot \mathbf{h}_k \right) + \mathbf{w}_1$ at the receiver.

2. For $j = 2, \ldots, K$:

• At the $(2j-1)$th and $(2j)$th relays, set $b_j = -1$ and, using (3), update the signals $d_{2j-1}$ and
$d_{2j}$ to be transmitted from this particular relay pair.

• Transmit signals from all relays to obtain $\mathbf{x}_j = \mathbf{U}_0 \left( \sum_{k=1}^{K} \mathbf{p}_k \odot \mathbf{h}_k \right) + \mathbf{w}_j$ at the receiver.

• If $\|\mathbf{x}_j\|^2 > \|\mathbf{x}_{j-1}\|^2$, then feed "1" from the receiver back to the relay; otherwise feed "0" back
to the relay. In the latter case, set $\mathbf{x}_j = \mathbf{x}_{j-1}$.

• If the received feedback at the $(2j-1)$th and $(2j)$th relays is 1, then select $b_j = -1$. Otherwise,
select $b_j = 1$.

Similar to Algorithms 2 and 4, Algorithms 5 and 6 are suboptimal. However, the latter two algorithms 
do not require any CSI at the receiver and, moreover, our simulations demonstrate that they achieve 
maximum diversity. It is also worth noting that this diversity benefit is achieved at linear decoding 
complexity. 
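
The single-symbol ($T = 1$) differential encoding and detection rule of this section can be illustrated with a short Python sketch. It assumes a QPSK constellation, a noiseless channel, and an arbitrary complex gain that is unknown to the detector. All names are hypothetical.

```python
import cmath

# QPSK points on the unit circle (constant-modulus constellation).
QPSK = [cmath.exp(1j * cmath.pi * (2 * m + 1) / 4) for m in range(4)]

def differential_encode(symbols):
    """u_l = u_{l-1} * s_l with the initial symbol u_0 = 1."""
    u = [1 + 0j]
    for s in symbols:
        u.append(u[-1] * s)
    return u

def differential_detect(x, constellation=QPSK):
    """For each l >= 1, return the index of s maximizing Re{x_{l-1} x_l^* s}."""
    decisions = []
    for prev, cur in zip(x, x[1:]):
        metric = [(prev * cur.conjugate() * s).real for s in constellation]
        decisions.append(metric.index(max(metric)))
    return decisions

sent = [0, 3, 1, 2, 2]                       # illustrative symbol indices
u = differential_encode([QPSK[i] for i in sent])
gain = 0.3 - 0.8j                            # unknown effective channel (assumption)
received = [gain * ul for ul in u]           # noiseless reception
detected = differential_detect(received)
```

The detector never uses `gain`; it only compares consecutive received samples, which is what removes the need for CSI at the receiver.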

VI. Simulations 

Throughout our simulation examples, the QPSK modulation is used and the channels are assumed to 
be statistically independent from each other. In all but the fourth example, we consider all the channels 
to be complex circular Gaussian random variables with zero-mean and unit variance and assume that 
$\theta_i = 1$, $i = 1, \ldots, R$ (which are the optimal power loading coefficients in this case). For the sake of
fairness of our comparisons, only techniques that do not need the instantaneous CSI at the relays are 
tested. Unless specified otherwise, the feedback is considered to be error-free. 

In the first example, we compare the bit error rate (BER) performances of the algorithms that select
the coefficients $b_i$ using the cooperative transmission scheme of Section III-A with $R = 20$ relays and
the same maximum power $P_0 = \ldots = P_R = P/(R+1)$. In this example, the full search-based (optimal)
algorithm is compared with Algorithms 1 and 2. Fig. 2 displays the BERs of these algorithms versus $P/\sigma^2$.
It can be seen from this figure that the SDR-based approach (Algorithm 1) performs about 1 dB better
than the iterative procedure of Algorithm 2. The performances of the optimal full search algorithm and
Algorithm 1 are nearly identical.

In our second example, the performances of the cooperative transmission schemes of Sections III-A
(Algorithm 1) and IV (Algorithm 4) are compared with that of the best relay selection (BRS) scheme,
the distributed beamforming approach of [27] with quantized feedback, and the distributed version of the
QOSTBC [14]. In the BRS scheme, the destination selects the relay that enjoys the largest receive SNR.
The relays only have the knowledge of their average receive power $E\{|r_i|^2\} = m_{f_i} P_0 + 1$ and they use
this knowledge to normalize the transmitted signal so that the average transmitted power of the $i$th relay
is $P_i$. It can be readily shown that this corresponds to the following relay selection rule at the destination:

(47)

Throughout this example, $R = 4$ and the source and relay powers are chosen from the optimal
power distribution for DSTC [14] as $P_0 = P/2$ and $P_i = P/(2R)$ ($i = 1, \ldots, R$). For the sake of
fairness, the distributed beamforming algorithm of [27] was implemented without the knowledge of
the instantaneous channel $f_i$ at each $i$th relay using the generalized Lloyd and genetic algorithms. The
beamformer codebooks required in the technique of [27] have been designed for the cases of one and
three feedback bits. Fig. 3 displays the BERs of the techniques tested versus $P/\sigma^2$.

Note that the distributed QOSTBC technique does not require any feedback, whereas the BRS technique
requires two bits of feedback, and Algorithms 1 and 4 require three and one bits of feedback,
respectively. However, the distributed QOSTBC approach requires a more complicated decoder and
imposes a decoding delay of $T = 4$.

It can be clearly seen from this figure that both Algorithms 1 and 4 substantially outperform BRS,
distributed QOSTBC, and the distributed beamforming approach of [27] with one-bit feedback. Also,
Algorithm 1 outperforms Algorithm 4 with a performance gain of more than 2 dB at the cost of a
higher feedback rate. The performances of Algorithm 1 and the approach of [27] with three bits of
feedback are nearly identical. However, it should be noted that the codebook design in the technique
of [27] represents a rather difficult optimization problem, and that this codebook has to be completely
redesigned and resent to the relay nodes whenever the channel statistics or the transmitted powers change.
This makes the implementation of the beamformer of [27] substantially more difficult than that of our
algorithms.

Fig. 4 compares the performance of our Algorithms 1 and 4 with that of the BRS technique and the
beamformer of [27] in the erroneous feedback case. All the other parameters are the same as used in the
previous figure. From Fig. 4 we observe that our algorithms are less sensitive to feedback errors than
BRS and the approach of [27].

In our third example, $R = 4$ is chosen. In this example, the performance of the differential techniques
developed in Section V is compared to that of the Sp(2) DSTC of [40], the coherent distributed QOSTBC
of [14] and the BRS technique with differential transmission in which the relay with the largest received
power is selected. Note that both the Sp(2) DSTC and the coherent distributed QOSTBC do not require 
any feedback, whereas the BRS approach requires a total of two feedback bits. It should be also stressed 
that unlike our differential schemes and the other schemes considered in this example, the coherent 
distributed QOSTBC requires full CSI at the receiver. The symbol rates of the Sp(2) DSTC, the coherent 
distributed QOSTBC and the BRS technique are the same as that of our differential techniques and are 
equal to 1/2 symbols per channel use. Important advantages of our technique w.r.t. the Sp(2) DSTC are 
lower decoding complexity, shorter required channel coherence time, and lower decoding delay. 

For the Sp(2) code, we use the 3-PSK constellation for the first two symbols and the 5-PSK constellation 
for the other two symbols. With that, a total rate of 0.9767 bits per channel use (bpcu) is achieved. The 
other schemes use the QPSK symbol constellations to achieve the total rate of 1 bpcu. 
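
The quoted rates follow from simple arithmetic, sketched below in Python. The sketch assumes that the Sp(2) code carries four symbols over eight channel uses and that the other schemes carry one QPSK symbol over two channel uses, which matches the symbol rate of 1/2 stated above. The function name is hypothetical.

```python
from math import log2

def bits_per_channel_use(constellation_sizes, channel_uses):
    """Total number of bits carried by the listed symbols divided by the channel uses."""
    return sum(log2(m) for m in constellation_sizes) / channel_uses

# Sp(2): two 3-PSK and two 5-PSK symbols over 8 channel uses (assumed count).
sp2_rate = bits_per_channel_use([3, 3, 5, 5], 8)
# Other schemes: one QPSK symbol per 2 channel uses (symbol rate 1/2).
qpsk_rate = bits_per_channel_use([4], 2)
```

The Sp(2) figure evaluates to $(2\log_2 3 + 2\log_2 5)/8 \approx 0.9767$ bpcu, slightly below the 1 bpcu of the QPSK-based schemes.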

Fig. 5 compares the block error rate (BLER) performance of the techniques evaluated versus $P/\sigma^2$.
The values of BLER are computed using blocks of four symbols. As can be observed from Fig. 5, both
Algorithms 5 and 6 outperform the Sp(2) code and the differential BRS approach, and their performance 
is close to the distributed QOSTBC (which requires the full CSI knowledge). In particular, it can be seen 
from this figure that the proposed techniques have approximately the same (maximum) diversity order 
as the Sp(2) code and the distributed QOSTBC with coherent decoder. 

These improvements come at the price of three bits and one bit of feedback for Algorithms 5 and
6, respectively. Also, note that Algorithm 5 uses a total of $2R$ auxiliary time-slots before starting the
transmission of information bits, while Algorithm 6 uses $3K + 1$ time-slots (one time-slot for each
feedback bit). On the other hand, the Sp(2) code uses $2R$ auxiliary time-slots.

In our fourth example, the performance of Algorithm 1 combined with long-term power loading (which
is developed in Section III-B) is compared with Algorithm 1 of Section III-A and with the analytical
results obtained from (48) by means of brute force optimization. For long-term power loading, the
approach of (30) with bisection search is used. In this example, $R = 4$ and $\vartheta_1 = \ldots = \vartheta_R = \vartheta$, where
the nearly optimal value of $\vartheta = 0.1$ has been chosen. The relay locations have been uniformly drawn
from a circle of normalized radius 0.5, while the distance between the source and destination is equal
to 2; see Fig. 6, which shows the geometry. The values of $m_{f_i}$ and $m_{g_i}$ depend on the distance
from the transmitter to the $i$th relay, where $m_{f_i} = m_{g_i} = 1$ in the center of the circle. We assume
that the path-loss exponent is equal to 3. The performance is averaged over random channel realizations
whereas the relay locations are kept fixed over all simulation runs. Both the line-of-sight (LOS) and
non-LOS (NLOS) scenarios are considered and equal maximum powers of the transmitter and relay nodes
($P_0 = P_1 = \ldots = P_R = P/(R+1)$) are taken. In the LOS channel case, it is assumed that $\phi_{f_i} = \phi_{g_i} = 1$
where $\phi_{f_i} = |\mu_{f_i}|^2/\sigma_{f_i}^2$ and $\phi_{g_i} = |\mu_{g_i}|^2/\sigma_{g_i}^2$.
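
A geometry of this kind can be generated with the short Python sketch below. Several details are assumptions made only for illustration and are not taken from the paper: the source is placed at $(-1, 0)$, the destination at $(1, 0)$, the circle is centered midway between them, and the mean channel powers are modeled as distance raised to the power $-3$, so that they equal one at the center of the circle. The function names are hypothetical.

```python
import math
import random

def draw_relays(R, radius=0.5, seed=0):
    """Draw R relay positions uniformly over a disc centered at the origin."""
    rng = random.Random(seed)
    points = []
    for _ in range(R):
        r = radius * math.sqrt(rng.random())   # sqrt gives uniform density over the area
        phi = 2 * math.pi * rng.random()
        points.append((r * math.cos(phi), r * math.sin(phi)))
    return points

def mean_channel_powers(relay, source=(-1.0, 0.0), dest=(1.0, 0.0), exponent=3.0):
    """Return (m_f, m_g) under a distance^(-exponent) path-loss model (assumption)."""
    d_f = math.dist(relay, source)             # source-to-relay distance
    d_g = math.dist(relay, dest)               # relay-to-destination distance
    return d_f ** -exponent, d_g ** -exponent

relays = draw_relays(4)                        # fixed over all simulation runs
powers = [mean_channel_powers(p) for p in relays]
```

Fixing the seed mirrors the setup in which the relay locations are kept fixed while the channel realizations vary.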

In Fig. 7, the BERs of the algorithms tested are shown versus $P/\sigma^2$. As can be clearly seen from the
figure, the proposed approach with long-term power loading achieves nearly the same performance as
predicted by (48) and substantially outperforms Algorithm 1 without power loading.

In our fifth example, we compare the performances of Algorithm 1 of Section III-A and Algorithm 4
of Section IV in the cases of perfect and imperfect feedback. In this example, $R = 4$, $P_0 = P_1 = \ldots =$
Pjl = P/{R + 1) and the feedback error probabilities Pg = 10^^ and Pg = 10""^ are assumed. 

Fig. 8 displays the BERs of the methods evaluated versus $P/\sigma^2$. As can be observed from this figure,
the performance of Algorithm 1 becomes sensitive to feedback errors when the BER values are smaller 
than the feedback error probability itself. Therefore, as the same link quality can be normally expected in 
both directions, the performance of Algorithm 1 should not be significantly affected by feedback errors. 

It can also be seen from Fig. 8 that, in contrast to Algorithm 1, the performance of Algorithm 4 is
not sensitive to feedback errors. The latter fact can be explained by the spatial diversity of the Alamouti 
code. 

VII. Conclusions 

A new approach to the use of low-rate feedback in wireless relay networks has been proposed. It
has been shown that our scheme achieves the maximum possible diversity offered by the relay network. 
To further improve the performance of the proposed scheme in practical scenarios, the knowledge of 
second-order channel statistics has been used to obtain long-term power loading coefficients by means of 
maximizing the receiver signal-to-noise ratio with proper power constraints. This maximization problem 
has been shown to be approximately equivalent to a convex feasibility problem whose solution has been 
demonstrated to be close to the optimal one in terms of the error probability. To improve the robustness 
of our scheme against feedback errors and further decrease the feedback rate, an extended version of 
the distributed Alamouti code has been developed. Finally, extensions of the proposed approach to the 
differential transmission case have been discussed. 

Simulations have verified an improved performance-to-feedback tradeoff of the proposed techniques as 
compared to other popular techniques such as distributed QOSTBC of [14], best relay selection method, 
distributed beamforming technique of [27] with quantized feedback, and the Sp(2) distributed code of 
[40].

Appendix
Proof of Proposition 1 

The symbol error probability (SER) for Q is given by [41] 



where $c_1$ and $c_2$ are two constants that depend on the constellation used, and $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$.
Using the Chernoff bound, we have



SER< ^E^.,g.{e-^^^}. (48) 
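
The Chernoff-type step can be checked numerically. The Python sketch below compares $Q(x)$ with the commonly used bound $\frac{1}{2} e^{-x^2/2}$ for $x \geq 0$. The constant $\frac{1}{2}$ is the standard form of this bound and is an assumption here, since the exact constants in (48) depend on $c_1$ and $c_2$. The function names are hypothetical.

```python
from math import erfc, exp, sqrt

def q_function(x):
    """Gaussian tail probability Q(x), written through the complementary error function."""
    return 0.5 * erfc(x / sqrt(2.0))

def chernoff_bound(x):
    """Exponential upper bound on Q(x), valid for x >= 0."""
    return 0.5 * exp(-x * x / 2.0)

# Evaluate both quantities on a few illustrative arguments.
grid = [0.0, 0.5, 1.0, 2.0, 3.0, 5.0]
pairs = [(q_function(x), chernoff_bound(x)) for x in grid]
```

Both quantities decrease monotonically in $x$, which is the property used in the argument that follows.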

Note that if we establish an upper bound for $\beta = 0$, then it will be also valid for any $\beta > 0$. This
follows from the fact that $Q(x)$ is a monotonically decreasing function. Using this fact, let us obtain an
upper bound on SER by using the particular value $\beta = 0$. Then, from (48) we obtain



SER< ^E/.,g.{e-^^^}. (49) 

First, let us calculate the expected value over the channel coefficients $f_i$. As these coefficients are
statistically independent, each term in the sum for $\gamma$ can be calculated independently. Using the complex
Gaussian pdf for $f_i$

$$p_{f_i}(f_i) = \frac{1}{\pi \sigma_{f_i}^2} e^{-|f_i - \mu_{f_i}|^2 / \sigma_{f_i}^2} \qquad (50)$$



and defining 



we obtain from (49) that



where 



A C2\gi\'^pi,i 

SER< |-Eg.|j]Ti| (52) 



1 

4 ' / e^^'l^'I'-l^'-^'^- (53) 



After straightforward manipulations, (53) can be rewritten as



dfi. (54) 



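The function inside the integral in (54) is equal to the complex Gaussian pdf $\mathcal{CN}\left( \frac{\mu_{f_i}}{a_i \sigma_{f_i}^2 + 1}, \frac{\sigma_{f_i}^2}{a_i \sigma_{f_i}^2 + 1} \right)$.
Therefore, the integral in (54) is equal to one and we obtain that

$$T_i = \frac{1}{a_i \sigma_{f_i}^2 + 1} \exp\left( -\frac{a_i \sigma_{f_i}^2 \phi_{f_i}}{a_i \sigma_{f_i}^2 + 1} \right)$$

where $\phi_{f_i} = |\mu_{f_i}|^2 / \sigma_{f_i}^2$. An upper bound approximation for the expected value in (52) can be derived
as follows. Since $a_i > 0$, we have that $0 < \exp\left( -\frac{a_i \sigma_{f_i}^2 \phi_{f_i}}{a_i \sigma_{f_i}^2 + 1} \right) < 1$. Therefore, $T_i$ can be upper-bounded as

$T_i < 1/(a_i \sigma_{f_i}^2 + 1)$ and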

Let us characterize the power of each transmitting node $P_i$ ($i = 0, \ldots, R$) as a fraction $P_i = \lambda_i P$ of
the total power $P = \sum_{i=0}^{R} P_i$ where $\sum_{i=0}^{R} \lambda_i = 1$. If $R$ is large, then, according to the law of large
numbers,

^ mf^Xo + P~^ 

where the inequality is satisfied in the almost sure sense. 



a = max 



i=i,...,R \mf.\o + P^^ 

and rrig. = E{|(7jp} = \iJ-g.\^ + cr^.. Therefore, from dTOl ) and dSTT ). we have that 

1 1 1 

< . 2,1. • (56) 



Using (fT2l ) and (l56l ). from (l55l ) we obtain that 



where 



a,; 



2(m/,Ao + l/P)(l + i?a) 
From (57) and the fact that all the channel coefficients are statistically independent, it can be readily
seen that for each $i$ the expectation over $g_i$ in the right-hand side can be calculated independently from
the other values $g_l$, $l \neq i$. The random variable $z_i = |g_i|^2$ has the non-central chi-square pdf with two
degrees of freedom:

$$p_{|g_i|^2}(z_i) = \frac{1}{\sigma_{g_i}^2} \exp\left( -\frac{z_i + |\mu_{g_i}|^2}{\sigma_{g_i}^2} \right) I_0\left( \frac{2 |\mu_{g_i}| \sqrt{z_i}}{\sigma_{g_i}^2} \right) \qquad (58)$$

where $I_0(\cdot)$ denotes the modified zero-order Bessel function of the first kind. Using (58) to compute the
expectation in (57), we have

where 

oo /*oo y 
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y-i = Zi/a"^^ and (jig. = \iig^\'^ / a'^^. Let us break up the integral (|59l ) into two terms as = f^^^ + fyp 
and use the following results from [36] to approximate Tj: 



e-y^h (2v^) dyi = - + 0{p-^) ^ - (60) 

y yrie-^-/o(2\/te)dy. = £^i(^-') + E^ (61) 

where $\mathrm{Ei}(q) = -c - \log q - \sum_{k=1}^{\infty} \frac{(-q)^k}{k \, k!}$ for $q > 0$, and $c$ denotes Euler's constant. Note that if
$\log P \gg 1$, then $\mathrm{Ei}(P^{-1}) \approx \log P$. Using the latter fact, from (59)-(61) we obtain for the case of large
$P$ that

i=l * 9' 



where = Oicr^^ + Defining 



A 



ci T-r e 



n 



2 -I- 4- ajcr^. 



and using the properties of the logarithm, we can rewrite (62) as



+ (n'?^)^^'')- (63) 



SER < K I^P^ 

For large values of $P$, the first term in the sum in (63) will dominate. Hence, Proposition 1 is proved. ■

Fig. 1. Comparison of the approximation and the exact value of $E\{P_s\}$.

Fig. 3. BERs versus $P/\sigma^2$; second example.

Fig. 4. BERs versus $P/\sigma^2$; second example.

Fig. 5. BLER versus $P/\sigma^2$; third example.

Fig. 6. Geometry of the fourth example.

Fig. 7. BER versus $P/\sigma^2$; fourth example.

Fig. 8. BER versus $P/\sigma^2$; fifth example.



